Mining the ESROM: A study of breeding value prediction in Manchego sheep by means of classification techniques plus attribute selection and construction

Author

  • Jose A. Gámez
Abstract

Manchego sheep is the native breed of Castilla-La Mancha (a region of Spain). Its two main products are Manchego cheese and Manchego lamb, which represent more than 50% of the final animal production in the region. Because of these economic implications, and with the aim of improving Manchego sheep production, a selection scheme (called ESROM) based on animal genetic merit was started fifteen years ago. One of the major points in the selection scheme is the estimation of the breeding value and its use in flock replacement. In the ESROM scheme, the breeding value is estimated using the BLUP animal model, a complex method that relates different traits through linear equations and solves the resulting system by taking all the available information into account simultaneously. In this paper we study the use of data mining techniques for breeding value classification. Our goal is far from replacing BLUP in breeding value estimation; on the contrary, it is to learn in a supervised way from the results produced by BLUP, and to use the learned models to provide preliminary information about the breeding value of an animal. The advantage of these models is that little information is required and the estimate can be produced as soon as the data (on a few variables) are available for a given animal, which allows decisions to be taken early or delayed until a deeper study is carried out. We start the data mining process by identifying a proper data set from the whole available data. We then use standard classification techniques combined with feature subset selection to identify good attribute subsets to be used as predictors. Attribute selection is based on filter and wrapper algorithms, and we also propose a filter+wrapper algorithm that provides results close to those of the wrapper at a remarkably smaller computational cost. We also show that classifier accuracy can be considerably improved (by around 4% on average) by using attribute construction. Finally, we discuss some tasks performed in the ESROM scheme in relation to the obtained classification models.

Keywords: Manchego sheep, selection scheme, breeding value, classification algorithms, data mining, attribute selection, attribute construction.
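
The filter+wrapper idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state which filter measure, search strategy, or classifier the paper uses, so the correlation-based ranking, the greedy forward search, the 1-nearest-neighbour classifier, and the synthetic data below are all assumptions chosen only to show why pre-filtering reduces the number of expensive wrapper evaluations.

```python
import math
import random

def pearson_abs(xs, ys):
    """Absolute Pearson correlation between one attribute and 0/1 labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0

def filter_rank(X, y, keep):
    """Filter step (assumed measure): keep the `keep` best-correlated attributes."""
    scores = [(pearson_abs([row[j] for row in X], y), j) for j in range(len(X[0]))]
    return [j for _, j in sorted(scores, reverse=True)[:keep]]

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (assumed)."""
    hits = 0
    for i, row in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, other in enumerate(X):
            if k == i:
                continue
            dist = sum((row[j] - other[j]) ** 2 for j in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        hits += best_label == y[i]
    return hits / len(X)

def filter_wrapper(X, y, keep):
    """Wrapper step: greedy forward search over the filtered candidates only."""
    candidates = filter_rank(X, y, keep)
    selected, best = [], 0.0
    improved = True
    while improved and candidates:
        improved = False
        acc, j = max((loo_accuracy(X, y, selected + [j]), j) for j in candidates)
        if acc > best:  # accept an attribute only if accuracy strictly improves
            best, improved = acc, True
            selected.append(j)
            candidates.remove(j)
    return selected, best

# Synthetic data (hypothetical, not ESROM records): attribute 0 separates the
# two classes cleanly, attributes 1..5 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[10 * label + rng.uniform(-1, 1)] + [rng.uniform(-1, 1) for _ in range(5)]
     for label in y]

selected, accuracy = filter_wrapper(X, y, keep=3)
print("selected attributes:", selected, "leave-one-out accuracy:", accuracy)
```

With `keep=3`, the wrapper evaluates at most three candidate attributes per round instead of all six, which is where the computational saving comes from; on this toy data the search should retain only attribute 0.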

Similar resources

Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services

The rapid growth of information technology (IT) creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and ultimately maximize their profitability. Many hospitals have large data warehouses containing customer ...

Customer Retention Based on the Number of Purchase: A Data Mining Approach

Purpose: this study looks for a relationship between the number of purchases and the income the customer brings to the company. It attempts to find, with the help of data mining tools, those customers who buy more than one life insurance policy and also show signs of good payment behavior. Design/methodology/approach: the approach of this research is to use data mining tools...

Customer Behavior Mining Framework (CBMF) using clustering and classification techniques

The present study proposes a Customer Behavior Mining Framework based on data mining techniques in a telecom company. This framework takes into account the customers’ behavior patterns and predicts the way they may act in the future. First, a clustering technique is used to implement portfolio analysis, and previous customers are divided based on socio-demographic features using k...

Predicting the Next State of Traffic by Data Mining Classification Techniques

Traffic prediction systems can play an essential role in intelligent transportation systems (ITS). Prediction and pattern comprehensibility of traffic characteristic parameters such as average speed, flow, and travel time could be beneficial both in advanced traveler information systems (ATIS) and in ITS traffic control systems. However, due to their complex nonlinear patterns, these systems ...

Publication date: 2005